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Abstract 

The optimal control of unitary transformations is a fundamental problem in quantum control 
theory and quantum information processing. The feasibility of performing such optimizations is de- 
termined by the computational and control resources required, particularly for systems with large 
Hilbert spaces. Prior work on unitary transformation control indicates that (i) for controllable sys- 
tems, local extrema in the search landscape for optimal control of quantum gates have null measure, 
facilitating the convergence of local search algorithms; but (ii) the required time for convergence to 
optimal controls can scale exponentially with Hilbert space dimension. Depending on the control 
system Hamiltonian, the landscape structure and scaling may vary. This work introduces methods 
for quantifying Hamiltonian-dependent and kinematic effects on control optimization dynamics in 
order to classify quantum systems according to the search effort and control resources required to 
implement arbitrary unitary transformations. 



1 Introduction 

The methodology of optimal control theory (OCT) has been applied to achieve various dynamical objec 
tives in quantum systems by manipulating constructive quantum wave interference to maximize the like 
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lihood of attaining desired target states pQ. Three classes of problems - state control, observable control, 
and unitary transformation or gate control 2 - have received the most attention in the quantum control 
community to date. The generation of targeted unitary transformations is of both fundamental inter- 
est and has direct applications to quantum information sciences since the quantum logic gates required 
to carry out quantum computation are represented by unitary transformations [3]. Since U(N) (and 
SU(N)) are compact Lie groups, it is possible to generate any U £ U(N) through sequential application 
of elements of a complete set of generators iH\, • • • , iH^ for U(N), i.e., W = exp(iHktk) ■ ■ ■ exp(iHit±). 
This strategy (uniform finite generation) is now commonly applied in gate decomposition strategies 
wherein the unitary gate W expressed in terms of n qubits (i.e., with a corresponding 2™ dimensional 
Hilbert space) is constructed through sequential application of various Uj = exp(iHjtj) which each act 
on only 1-2 qubits [H [SJ. However, provided the system is controllable, it is also possible to generate 
any W by shaping time-dependent control functions £j(t), j = 2, • • • , k over a time interval [0, T], Each 
control function is coupled to a corresponding control Hamiltonian iHj, j — 2, • • • , k, which are all 
simultaneously applied in order to generate the desired W at time T. This is the method of optimal 
control theory. Typically, uniform finite generation requires a greater total evolution time T than OCT 
methods based on pulse shaping [J]. 

OCT has been applied to unitary transformation control for the purposes of quantum computation 
in a variety of quantum systems. Kosloff and coworkers studied the implementation of quantum gates 
based on vibrational eigenstates of the molecular sodium ion Na^ on the ground electronic surface 
[5J E] . Similar studies were carried out similar studies in the acetylene molecule using the asymmetric 
C-H stretching and bending modes by de Vivie-Riedle and coworkers [TJ. Gate control on spin-system 
dynamics with optimally designed NMR pulses has been performed by several groups [HI [HI EH fTTj . 
Deutsch and coworkers implemented unitary maps on the magnetic sublevels of the ground electronic 
state of cesium [12] . The computational studies using OCT can be extended to the laboratory by shaping 
ultrafast laser fields using Optimal Control Experiment (OCE) [T3J to generate control functions e(t). 

An important issue in determining the feasibility of optimally constructing unitary transformations 
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is how the required search effort scales with the Hilbert space dimension of the system under control. 
Whereas state control and maximization of observable expectation values have met with widespread 
success in experimental and computational incarnations [H H3] , with search effort generally invariant to 
the Hilbert space dimension [151 116) , the achievement of high fidelity unitary transformations has proved 
more challenging [SHH1IIS], especially for large systems. 

Recently, a series of fundamental studies have been carried out on the underlying properties of 
quantum control landscapes, defined as the map between the external control field and the objective 
fidelity [5J [T71 [TBI EH [20] ■ These studies revealed that under reasonable assumptions and in the absence 
of auxiliary costs (e.g., on the field fluence) or constraints, the control landscape contains no sub-optimal 
extrema, or "traps" that can hinder a gradient-based local algorithm for finding an optimal control field. 
The importance of the landscape topology to determining the feasibility of quantum control is beginning 
to be more widely recognized [12] . The topology of quantum control landscapes is dominated by so-called 
kinematic extremals, which are determined by the cost function alone and independent of the Hamiltonian 
used [T7J Uni [20] ■ Although the trap-free landscape topology ensures convergence of unconstrained 
gradient-based algorithms given sufficient effort, the topology does not specify the convergence rate of 
such algorithms, which additionally depends on the local landscape structure (e.g., slope and curvature). 
In this paper we examine the effects of control landscape features on the convergence rate of first-order 
algorithms for the optimal generation of unitary transformations. The primary goals are (i) to assign 
different Hamiltonians to classes exhibiting exponential or sub-exponential scaling with the system size, 
and (ii) to quantify the effects of the local landscape structure on the convergence rate. 

The paper is organized as follows. In Section [21 the theoretical formulation of the control problem 
is presented, including a summary of the associated landscape topology. Section 3 provides a frame- 
work that unifies first-order OCT algorithms for gate control, demonstrating that the convergence of all 
these algorithms is governed by the same underlying landscape topology. Section 4 presents metrics for 
landscape slope and curvature, including kinematic bounds, and dynamical metrics that quantify the 
effect of the control system Hamiltonian. Section 5 defines the control systems and target propagators 
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used in the simulations. Section 6 presents numerical results on control optimization search effort and 
resource scaling with respect to Hilbert space dimension and identifies classes of Hamiltonians exhibiting 
exponential and sub-exponential scaling, while Section 7 relates search effort and resource scaling to local 
landscape structure. Finally, in Section 8, we draw conclusions from the findings. 



2 Optimal Control Theory 

2. A Dynamical Formulation of the Control Objective 

Consider an iV-dimensional isolated quantum system whose dynamics are governed by the time-dependent 
Schrodingcr equation, 

ih?^ = H ( K ,t)U(t), E/(0,0) = I, (1) 

where H(K,t) is the time-dependent Hamiltonian whose control variables are denoted as k. In atomic 
units, the unitary propagator at some final time T is 



U(T) = Tcxp jf 



H(K,t)dt] , (2) 



where T is the time-ordering operator and U(T) is implicitly understood to be a function of k, which in 
this work is represented by an external control field k — > e(t). We consider an isolated quantum system 
satisfying the dipole formulation H(k, t) = Hq — (ie(t) where H is the field-free (drift) Hamiltonian and 
li is the dipole or control Hamiltonian operator. 

This work concerns the class of control objective functionals 

J[e(-)]=F(U(T)). (3) 

The endpoint control objective F may be defined as guiding the system's unitary propagator U to match 
a pre-specified unitary matrix W. A convenient cost function is to minimize the Hilbert-Schmidt distance 
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F(U) = \\W-U\\^ 

F(U) = \\W-U\\ 2 

= Tr[(W -U)\W -U)] 

= 2N -2ftTr(W^U), (4) 

where the desired minimumO? 1 = is achieved when U — W, and the global maximum F = AN 
corresponds to U = —W. The quantum system is assumed to be controllable such that any desired U 
can be generated by some choice of e{t) at time T. (see Section |4~E|) . 

In this work, the objective F of Eq. Q is optimized using dynamical controls present in an external 
electric field e(t). We consider a controllable quantum system with N levels |1),...,|JV) in Hq. To 
determine an optimal control e(t) that maximizes or minimizes Eq. ([3]), it is useful to define a Lagrangian 
functional J that directly imposes the dynamical constraint in Eq. ((TJ): 

J = 2N-2ReTr{W^U(T)) + Tr ^ (t) (-i(H - fis{t))U(t) - 

= 27V - 2Re Tr(W^U(T)) - iTr(<ff(T)U(T)) + zTr(0t(O)l7(O))+ 
T H(U(t),<f>(t),s(t)) +Tr (&lu(t)) dt, 



dt (5) 



where 4>(t) is a Lagrange multiplier matrix function and 4>(T) = U(T)U'(t)<f>(t). Denoting by (A, B) the 
Hilbert-Schmidt inner product Tr{A^ B), the first integrand term 

H = -(U\T)cb(T),iU\t)H Q U(t))+e(t)(U\T)cf > (T),tU^t) f iU(t)) 

is the PMP (Pontryagin maximum principle) Hamiltonian function 21, 22 . A necessary condition for 
maximizing or minimizing Eq. ([3]) subject to the dynamical constraint is satisfaction of the first-order 
conditions (Euler-Lagrange equations) for the Lagrangian J [35]. The first Euler-Lagrange equation is 
simply the Schrodinger equation ([1]). The second Euler-Lagrange equation of (JSJ) is 

d<j>(t) 



dt 



= -i(H -i*e(t))<Kt), (6) 



The minimum of F corresponds to gate fidelity -±-9iTr(iy+(7) = 1. 
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where 4>{T) satisfies the boundary condition tj>(T) = Vu(t)F(U(T)). For F(U) given by Eq. (g]), we 
have [23] 

V U(T) F(U(T, 0)) = U(T)W^U(T) - W. (7) 
The third Euler-Lagrange equation (critical condition) is _ q p Qr a con |; ro i system satisfying Eq. 
©, 

ait 

^=-iTr([/t (T) ^ (T )[/t (t)/i[/W ) 

= -jTr(D0[/(T) - tft(T)W] U\t)iiU{t)) = 0. (8) 

Consider the control-propagator map Vt ■ £(■)<-> U(T) and the composition of maps J = F o Vr- 
Then, the functional derivative evaluated at time t is denoted as = gp|r . For simplicity of 
exposition, we use the symbols J,e(t) interchangeably with J, e(-), respectively, and refer simply to the 
derivative jT^y- Control fields that satisfy Eqs. ([1]), ([5]), and © constitute the critical points of the 
control landscape J is). Since the problem min J with J given by ([3]) is underdetermined, these critical 

e(t) 

points lie on critical submanifolds, each consisting of an infinite number of fields which produce the same 
value of J. 

2.B Critical Topology of J 

The matrix -iU(T)W(t)fj,U(t) which appears in equation (JS| is the functional derivative s >./ . A 
simplifying condition that facilitates extraction of the critical topology of J(e(-)) is that the iV 2 x TV 2 
Hcrmitian Gramian matrix 

G = f T v [[/(T>(i)] v T [KtWHT)] dt, (9) 
Jo 

is full rank [24]. Here, i/ denotes the "vectorization" of an N X N complex matrix into an iV 2 -component 
complex vector and = -iW(t)fiU(t). Note G^ )W = £ {i\U(T)(i(t)\j) (k\fj,(t)W (T)\l) dt. Satisfac- 
tion of the full-rank condition ensures that ([5]). which may be written (VF(U(T)) 7 ^ggi) = 0, implies 
VF(U(T)) = (see Section f4 . B [) . The condition is verified numerically for diverse classes of quantum 
control systems in Section 7. 



When this condition is satisfied, the critical topology (number of local optima and their optimal- 
ly status) of the control landscape J(e(t)) is equivalent to that of F(U) in Eq. (Q} [PS] US]. The 
control-propagator map Vt associates with each critical submanifold of F(U) a critical submanifold of 
J(e(-)) whose number of positive and negative Hessian eigenvalues is identical, but which has an infinite- 
dimensional nullspace [3D]. The topology may thus be analyzed by considering the kinematic degrees 
of freedom, using for example, the A^ 2 matrix elements of U as controls. It has been shown that there 
axe N + 1 distinct critical values of F = 2N - 2Re Tr (W^U), F=0, 4, 8, ... AN corresponding to crit- 
ical points U where VF(J7) = [THUS]. The topology of these critical points can be determined by 
considering the Hessian operator H of F, 

% ^U) = ^ (10) 

where the {xi} are a suitable set of local kinematic coordinates around the critical point U. Of interest 
is the number of the positive, negative, and zero eigenvalues of the Hessian at U, which correspond to 
the number of upward, downward, and flat directions at that point. 

The Hessian eigenvalue enumeration may be obtained from the Hessian quadratic form (HQF) Q of 

F: 

Q A {W^U) = 4ReTr(A 2 l^ t L>), (11) 
where A is an infinitesimal Hermitian matrix |25) . At a critical point, it may be shown |25] that 

N 

W^U = Rj2 s k\k)(k\R\ (12) 
fe=i 

for some unitary R and where 5i = ±1. Evaluating the HQF explicitly at a critical point and writing 
the elements of A as Aij = + ify yields ^V 2 terms (N terms in the first sum and A^ 2 — A* terms in 
the second sum) corresponding to the A^ 2 eigenvalues of the Hessian [35] , 



Q A (W^U) = 4 



N 

3=1 l<k<e<N 



(13) 



The sign of each term in Eq. (fT3|) corresponds to the sign of each Hessian eigenvalue. It has been shown 
[231 121>] that for a critical point with an objective value of F = 4m for m = 0, 1, 2, . . . , N, the number of 



positive (h + ), negative (ft--) and zero (ho) type of eigenvalue is 

h+ = (N - to) 2 ; /i_=to 2 ; h = 2Nm-2m 2 . (14) 

For F=0, m=Q, there are N 2 positive eigenvalues, indicating that the optimum is an isolated point. 
Similarly, for F=4N, where m = N, there are iV 2 negative eigenvalues. For intermediate values of F, 
there are a mixture of positive, negative, and zero eigenvalues, indicating that all intermediate critical 
points have a saddle topology. For example, at F=4, m=l, and h + = (N — 4) 2 , h- = 1, and ho — 27V — 2. 
Assuming that minimization of F is desired, this saddle point may be expected to pose a hindrance to 
search effort, as there is only one direction out of TV 2 leading down to the global minimum. The higher 
saddles contain more negative eigenvalues and thus are expected to pose less of a hindrance in search 
effort. This matter will be examined in Section [6.AI 



3 Optimization Methods 

For unitary transformation control, deterministic first-order algorithms are typically used for control 
optimization. In this Section we compare these first-order algorithms and demonstrate that they share 
a common fixed point topology. In Section 4 we extend these results to demonstrate that the algorithms 
share common bounds on their convergence rates to identify optimal controls. 

The simplest first-order algorithm is the gradient flow of the objective function. Using the variable s 
to index the search path, the gradient flow trajectory is the solution e(s,t) to the initial value problem 

de(s,t) 6J(s) 

^^ = a(s) &M)' (15) 

for a specified initial guess for the control e(0,i), where a(s) is an adaptive step size. Associated with 
the control field trajectory e(s, t) is a trajectory for U(s, T) in U(N) for the final dynamical propagator, 
which is induced by the control-propagator map Vy. In the numerical simulations in this work, Eq. (|15[) 
will be solved using a variable step size fourth order Runge-Kutta integrator built into MATLAB [27] . 
A primary concern in this paper is the convergence rate of such algorithms, whose fixed points include 
all e(t) such that U(T) = U in Eq. (fE 



As s — > oo, e(s, i) converges toward stable fixed points e(t) that are critical points of J. These points 
are neutrally stable, i.e., within any neighborhood N of s(t) consisting of controls e(t) such that 

\\e(t)-s(t)\\<e, 

there exists a subneighborhood N' C N such that if e(0,<) £ JV', e(s, t) 6 TV for all s. 

The neutrally stable e(f ) solutions are the global optima of J, which can be seen from the correspond- 
ing trajectory U(s,T) induced by the map Vr- U(s,T) converges to asymptotically stable fixed points 
that are optima of F(U) - points U such that 

U{0,T)-U <5^ lim U(s, T) = U, 

s— »oo 

for some 5 that is equal to the radius of the attracting region of the fixed point. The latter are the 
critical points identified in Eq. (fT2)) with positive definite Hessian (fl3|) - and according to ([T4]) . the only 
critical point satisfying this criterion is the unique global optimum W . Due to the asymptotic stability 
of U = W, any e(0, t) such that U(0, T) — Vr(e(0, t)) is within the attracting region of W will converge 
to a neutrally stable fixed point e(t) that lies on the global optimum submanifold of J. The instability 
of fixed points i(t) lying on other critical submanifolds with J > follows from the indefiniteness of the 
HQF at U ^ W. 

Many unitary control studies use so-called PMP-iterative algorithms [55], which can be formulated 
only in discrete time. These algorithms iteratively integrate equations ^ and ([T|) at each step k, using 
control fields e k (t) = a k e k -i{t) + ^ k (^k(t)\n\U k -i(t)), e k (t) = a k e k {t) + f3 k ((f) k (t)\fi\U k (t)) , respectively, 
where a, /3 are scalars. The fixed points of these algorithms are points on the control landscape where 
£k — £fc = or e k — e k -\ = 0. In Appendix fAI we show that under appropriate regularity conditions the 
only neutrally stable e(t) lie on the global optimum submanifold where U(T) — W . A third class of gate 
control optimization algorithms consists of first-order tracking algorithms which follow a prescribed path 
in the space of propagators to the target gate; these have been shown to be capable of achieving gate 
fidelities approaching machine precision [23) . In this work we focus on the application of steepest descent 
algorithms (115p , due to the mathematical convenience of formulating convergence to a stable fixed point 
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for these continuous time algorithms, but our conclusions on convergence efficiency are applicable to 
PMP-iterative and tracking algorithms as well. 

4 Landscape Structure Metrics 

The landscape topology summarized in Section I2.BI suggests that an optimal field to achieve a desired 
unitary transformation may be readily found because no suboptimal extrema exist on the landscape. 
This attractive behavior does not preclude the possibility that complicated landscape features, including 
strong influence by saddle regions, may impede optimal searches. Thus, an understanding of the effects of 
the local landscape structure on optimal searches is necessary in order to explain and predict the scaling 
of search effort with system size N. We introduce landsape metrics and show that the same landscape 
local structural features govern the convergence of PMP-iterative and gradient-based algorithms. 

4. A First-order metrics 

The local structure metrics of the landscape are based on a Taylor expansion J(e(s,t) + 5e(s,t)) = 

J(e(s, t)) + J T V e J(e)Se(s, t) dt+ § J Q T J Q T U(t, t')Se(s, t)Se(s, t') dtdt' H of the cost functional J with 

respect to e{s,t). The slope metric Q m at a point s m on the landscape is defined as 



the gradient on the landscape at the mth point. Beginning from the expression in Eq. 1161 Q m at any 
point m is bounded by 




(16) 




. The metric Q m is thus equivalent to the magnitude of 





T 



1/2 



< 



dt{\{W^U{T),pL{t)) + {U\T)Wm\f 



< 




2Ns/T\\n\\. 



(17) 
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Above, the Cauchy-Schwarz inequality is used twice. A greater value of Q m at m = 0, 1, 2, . . . results in 
a locally faster descent. 

We can also establish a bound on \sk(t) — s k (t)\ in PMP-iterative algorithms (see Appendix |A"|) : 

\(Mt)\v\U k -i(t))\ = \Tr[Ul(T)(W- U k ^{T)W^U k ^{T))ul{T)),ul_^U k m\ 

< \u k {t)Ul{T)WU k - X {t)\\ ||m|| + \\u k (t)ul{T)U k ^(T)W*U k - 1 (T)U k ^ 1 (t) 

< 2N\\(i\\ . 

Thus, 

r< fc )(t)-e«(i) <27V%/T||/i||. 

In PMP-iterative algorithms, the bound on the increment in the field for infinitesimally small step length 
is equivalent to that for steepest descent with a Mayer cost. 

4.B Second-order metrics 

From the second variation of the objective functional J, we may derive the Hessian kernel; the elements 
of the Hessian are given by [53] 

H(t,t') = Tr [W^U(T)n(t)n(f) + (T)W fj,(t')fj,(t) + (W f U(T) - U\T)W)W)^{t')]} . (18) 

At a critical point, the last term of Eq. (fT5|) drops out. The relationship between the HQF expression 
(fTTj) and (fT8|) is described in Appendix IB"1 

In steepest descent algorithms, according to the gradient expression ([5]), Se(t) is composed of linear 
combinations of the real and imaginary components of ^ij{t). The eigenvalues of the Hessian (|18p specify 
the rates at which new frequency modes required for optimal control (contained within the eigenfunctions 
of H(t,t')) can be added to 6e(t). Thus, several bounds on the Hessian are given below. 

The Hessian trace or mean curvature is given by 

TrU{t,t') = [ dtn{t,t). (19) 
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At a critical point, 

TrH(t,t') = f dm [W^U{T)U\t)^U{t)U^{t)nU{t) + U\T)WU ] '{i)^U{t)U ] '(t)p,U(t)] 
Jo 

= [ dm[W^U(T)U^(t)[i 2 U(t) + U(T)W'<U'<{t)ti 2 U{t)] (20) 
Jo 

At J=0, W^U(T) = I N , and Eq. O becomes 

TrH(t,t')\ = 2 f dm [U\t)fi 2 U(t)] 
Jo 

f T 

= 2 / dm [u\t)u(t)n 2 ] 

Jo 

= 2TTr^i 2 (21) 

where the second step uses the cyclic permutation trace rule. Similarly, at the maximum J=AN , the 
trace is given by the negative value of Eq. (|2"Tj) . 

We can also calculate a bound on Hessian mean curvature away from a critical point: 

Tr(H(t,t')) = [ dm[W^U{T)U^{t)n 2 U{t) + U{T)W^U{t)n 2 U{t)} 
Jo 

= [ dt(\\w'<U(T)\\\\n(t) 2 \\ + \\u'<(T)W\\\\n(.t) 2 \\) 
Jo 

= 2ivr||^ 2 || . 

Finally, we consider the local curvature, or the projection of the Hessian matrix on to the normalized 
gradient vector u^, 

C m u m • H. • . (^^) 

The curvature near the optimum may influence the required search effort by determining the ease of 
convergence to the optimum. Note that since the gradient and Hessian can be expressed in terms of 
the same N 2 basis functions of time, only L 2 inner products of components of the time-evolved dipolc 
operator contribute to the local curvature of the control landscape. 

4.C Distance metrics 

On a search trajectory, the field starts out at algorithmic index s=0 with e(0,i) and progresses in steps 
s — > s + ds (i.e., e(s,t) — > e(s + ds,t)) until the trajectory ends at an optimal field, e op t = e(sM,t) at 

12 



s = sm- The complexity of the search may be characterized by the ratio of the trajectory path length 
||Ape(f)|| to the Euclidian distance between initial and final control fields ||Ape(i)||, 



R _||Ape( t )||_ Jo"*>lSo* 



2\ V2 

de(s,t) 
ds 



(23) 



The closer R e is to unity, then the more direct the path, i.e., the closer the path is to a straight line in 
search space. This metric will be used to assess the complexity of the search trajectories followed during 
optimizations. 

Since the presence of saddle manifolds on the landscape may influence the efficiency of an optimal 
search, the distance of points on the search trajectory to the nearest saddle also provides important 
structural information. This distance may be measured by examining the eigenvalues of the matrix 
V = W^U, since these eigenvalues are all ±1 at any saddle (all +1 at J—0 and all —1 at J=AN). If Ei 
are the eigenvalues of V, a convenient metric to express the distance to the nearest saddle is 

N 

5 = ^^(1-1^1), (24) 
i=i 

where the normalization factor Af is Af = 2/ J if J < 2N and Af = 2/(4JV - J) if J > 2N , which makes 
the maximal allowed value of S equal to 1 for any J value. Finding that S is close to zero near the 
saddle values (J=4, 8, 12, ...) indicates that the search trajectory encounters the saddle manifold at this 
J value. The effects of search trajectories approaching saddle manifolds will be examined in detail in 
Section 

4.D Gramian matrix 

Unlike the Hessian, which depends on F(U) as well as the system Hamiltonian, the Gramian matrix G £ ,t 
([9]) provides a means of characterizing purely dynamical effects on the optimization trajectory. Consider 
the equation for the controlled propagator U(s, T) corresponding to the gradient flow (|15p . Denoting by 
u(s,T) the vectorization of the propagator U(s,T), we have [2"4"] : 

= G £s . T V u F(u(s, T)), (25) 
13 



from which it can be seen that G £i t is a linear map between vectors in TjjU (N) and all system-dependent 
effects. The eigenvectors of the Gramian are N 2 orthogonal directions in the tangent space to the unitary 
group TjjU(N). The magnitudes of the eigenvalues of G £j t represent the dynamical contributions of the 
corresponding orthogonal directions in TuU(N) to the propagator variation 5U(s,T) induced by the 
gradient flow control variation Se(s,t). 

If the eigenvalues of G 6j t are always sufficiently far from zero, corresponding to a well-conditioned 



matrix, then 



SJ 



can be infinitesimally small only near the global optimum submanifold, irrespective 



Se(t) 

of the direction of VjjF(U(T)), which facilitates convergence. This can be seen as follows: the critical 
point condition V J — implies 

(VuF(U(T)), ^> = 0, Vt; 

then if Vr/F([7) ^ at a critical point, then rank G £y r < N 2 . Since the only neutrally stable fixed 
points of flow (|15[) lie on the global optimum manifold, the claim holds. 

More generally, control systems for which the expected values of the condition number of G 6i t are 
low near the global optima typically exhibit faster convergence, as shown in Section 7. A special case is 
where G e .t is degenerate; then, the dynamical contribution of each eigenvector in TjjU{N) contributes 
equally when the control variation is integrated over time and convergence is governed by the kinematic 
gradr 



any wi 
iienfl 



4.E Higher-order analysis 

The above metrics and associated bounds characterize the local properties of the control landscape. 
Properties of the global solution to the gate control problem can be analyzed using the Dyson series 
expansion for the controlled unitary propagator in the interaction picture |29) : 



2 In prior work on gate control optimization, the target gate W was sometimes chosen to reside in a subspace of the 
dynamical Hilbert space [5j [6] . In this case, operators on the Hilbert subspace need not be unitary (rather, they belong to 
the class of Kraus operators or positive trace- preserving maps), and Hamiltonian-dependent contributions to optimization 
efficiency are governed primarily by the eigenvalues of the Gramian matrix (0 on that subspace rather than the entire 
Hilbert space. 
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Ui(T)=I N +i f T V^{t l )fiV(t L )e{t L )dt l + f T V^t^nVit^eit 1 ) f V^(t 2 )fiV(t 2 )s(t 2 ) dt 2 dt x + ■ ■ ■ 
Jo Jo Jo 

(26) 

where V(t) — exp(— iH$t). The n-th term in the Dyson expansion corresponds physically to the set of 
possible n-photon transition pathways between eigenstates over time [0,T]. Note that each successive 
term in (|26|) contains higher order products of exp(iHot)^,exp(—iHot) and e(t) than the previous terms. 
For any bounded field fluence s 2 (t) dt and tolerance c, the series converges to Ui(T) at some finite 
order [2"9] . 

The matrices Hq and /i, which define the control system, also fully determine the minimal order in 
series (|26p required to produce any given W. Let £{Hq,(i} denote the dynamical Lie algebra of the 
quantum control system - i.e., the Lie algebra spanned by repeated commutators of Hq and fj,: 

£{H , fj.} = span {H ik , • ■ ■ , {H i2 , H^}}, Hi G {H , m}- 

Beyond some critical k (called the depth of £), which depends on the control system, the dynamical Lie 
algebra saturates [4]. For bilinear quantum control systems, a sufficient condition for full controllability 
- i.e., the existence of a control field e(t) such that the corresponding U(T) induced by the Schrodinger 
equation can be any unitary matrix - is that rank C{Hq, /i} = iV 2 [H 130) . 

Higher-order terms in series (|26l) correspond to higher order commutators, and hence generally require 
higher field fluences (higher amplitudes of the corresponding Fourier modes). If these amplitudes in s(t) 
are below certain minimal values required for the corresponding Dyson series terms to be nonnegligible, 
it is not possible for the field to be a solution to the gate control problem. In general, systems with 
control Hamiltonian (dipole) operators fi that systematically exclude distant transitions require higher 
order terms in the Dyson series to reach any arbitrary gate W, which increases the nonlinearity of 
the optimization problem for these control systems with greater Lie algebra depth k. This results in 
the optimal controls e(t) containing more frequency modes, corresponding to more complex control 
mechanisms [29) . In Sections 6 and 7, we demonstrate that optimizing with control Hamiltonians with 
weak or forbidden transitions between distant quantum states yields higher-fluence and more complex 
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optimal controls e(t), which require a greater algorithmic search effort to find. 

Equation (|26p also determines when the Gramian matrix ^ may be nonsingular. Only for controls 
e(t) where all the terms in the Dyson series required for full controllability of a given Hq and /i have 
become populated can the Gramian matrix be well-conditioned. This is quantified by the relation^] 

rank G Ey r < rank £{H ,n}- 

In Section 7, the effects of control system Hamiltonians on Gramian condition number are studied 
numerically. 



5 Control systems 

An exhaustive sampling of system structures for the drift Hamiltonian Hq and the control Hamiltonian fi 
is impractical. Here, we choose two Hq and four /i structures motivated by common propagator control 
systems, which differ qualitatively in their Lie algebra depth, as described in Section 4.E. The goal is to 
provide an overview of search behavior that might be expected without making any specific predictions 
for the behavior of any particular quantum system. In order to delimit the space of drift and control 
Hamiltonians studied, only diagonal Hq structures are considered, with structural variation restricted to 
the control Hamiltonians \x. Although control systems requiring multiple fields to ensure controllability 
(e.g., coupled spin systems) may be used to realize quantum computation we consider only systems 
controllable by a single field here so that the search behavior across different systems can be directly 
compared. 

We consider an iV-level quantum system in arbitrary dimensionless units. Two model systems with 
Hq in its diagonal basis are considered. First is that of a rigid rotor, 

N-l 

H =^2-j(j + l)\j)(j\, (27) 
and second is that of an anharmonic oscillator, 



N-l r 



3=0 



,2 



^(j + l/2)-^(j + l/2) 2 



m\, (28) 



5 We prove this result, as well as other necessary conditions for nonsingularity of the Gramian, in a separate work I31| . 

l(i 



with lu = 20 and V = 2000. 

Four physically relevant real matrix control Hamiltonian structures /x will be considered, paired with 
one of the two Hq structures above. Control Hamiltonians of different matrix distributions of off-diagonal 
elements were chosen because of their different Lie algebra depths and optimal control mechanisms, as 
discussed in Section 14. El All of the control Hamiltonian structures used here have nonzero trace in 
order to make the systems controllable on U (N) and not only on SU(N). For many physical systems the 
coupling between states decreases as the difference between the quantum numbers of the states increases, 
and the first choice of fj, takes this property into account, with the following structure 
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V = 



a 
1 

D 

D 2 



D 
1 



1 

D 



D 2 
D 
1 



D N-2 
D N-3 
D N-A 
D N-5 



(29) 



y D N-2 D N-3 D N-A D N-5 Q J 

where a > 0, D G [0, 1] is the coupling parameter, and all elements of /i have a random phase of 
±1 with the restriction that \x remains symmetric. This "D" structure qualitatively corresponds to 
diatomic molecules and other anharmonic vibrational systems. The second control Hamiltonian structure 
examined is the related "banded" structure where a fixed number of rows nearest to the diagonal have 
elements of ±1 with the remaining rows having elements of zero; the extreme of having only one row with 
allowed transitions qualitatively corresponds to a harmonic oscillator or rigid rotor. Third, we consider a 
"sparse" structure with 50% of the off-diagonal elements randomly chosen as ±1 and the remaining 50% 
of the off-diagonal elements being zero, while maintaining fi as symmetric. Sparse control Hamiltonians 
with fewer than 50% allowed couplings are examined as well in Section I6.CI Control Hamiltonian 
structures containing some allowed and some forbidden transitions qualitatively correspond to coupled- 
spin system qubit structures commonly used in quantum computation, although only certain specified 
distributions of couplings are allowed for qubit systems. In order to investigate the search effort for 
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systems with such control operators, we consider a "tensor product" control Hamiltonian on n qubits, 

(30) 

where the diagonal matrix is added to make the system controllable on U(N) and <j 3 x is 




We consider pairings of this /i with the diagonal H$ operators above |_ . 

As we will demonstrate in Sections 6 and 7, the distribution of couplings between states is important 
for assessing the scaling of effort with N . With this in mind, comparing the sparse and tensor product 
structures /x reveals some important differences, for example at N=8. 
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(31) 



V 

While each operator has an equal number of allowed transitions, the tensor product structure allows no 
transitions more than four states apart (note the zeros in the upper-right and lower-left corners of the 
matrix). The sparse [i example shown, in contrast, allows transitions between states |1) and |8), as well 
as some other transitions five or more states apart. Structural differences in coupling distributions are 
even more evident at 7V=16 and iV=32. We thus define two distinct classes of control Hamiltonians: 
those that allow distant transitions, including the D=1.0 and sparse structures, and those that forbid 
or have very weak distant transitions, including D <1.0, banded, and tensor product structures. The 



4 Systems with multiple control fields, each associated with a different Pauli operator (required for full controllability of 
coupled spins), are considered in a separate work. 
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differences in coupiing distributions between these two ciasscs influence the required search effort, as will 
be shown in Sections 6 and 7. 

The simulations will consider both random Haar-distributed unitary W matrices 32 and the Fourier 
transform quantum gate, 

W FT)n (i,fc) = i r exp(^^) J (32) 

where j and k denote the matrix elements and run from 1 to N = 2 n . 
The initial field at s = is chosen as 

K 

^2 sin (vkt + M, t£ [0, T] (33) 
fc=i 

where {tUk} are the K Fourier components of the field, which are seiected randomly and bounded by the 
frequency of the |1) — > \N) transition in Ho, is a random phase on [0, 2tt], and f 2 is the field fluencc. 
Prior to multiplication by /, the field is normalized to have unit fiuence. 



e{0,t) = /exp 



8tt 



t - 



T 



6 Control Search Complexity 

The search effort required to find an optimal field s(t) has important implications for determining the 
feasibility of controlling the dynamics of complex systems. In Section 16. Ai the influence of the saddle 
point topology of the control landscape is assessed. In Section T6.B1 we examine the search effort as a 
function of N for a broad range of choices of Ho, n, initial field strength /, and W. Further exploration 
of the control Hamiltonian structure's effect on search effort in Section lB.CI identifies control Hamiltonian 
properties that result in the most efficient searches. Details of the numerical parameters in the simulations 
are given in Appendix [C] 

6. A Influence of Landscape Saddle Point Topology 

The simulations in this section address how the landscape topology, which is primarily determined by 
the kinematic cost function F, influences the behavior of gradient-based optimizations for systems of 
dimension up to N—8. In particular, we examine the extent to which the saddle manifolds influence the 
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search trajectory and whether the saddle effects are dependent on the choice of W or the initial control 
field. The Hamiltonian H is given by Eq. (g7]| and n given by Eq. (gHJ) with D=1.0, 0.9, or 0.6. 

The trajectories of three searches for N=A are shown in Figure [TJ Comparison of the saddle metric S 
(c.f., Eq. I24[) in the right panel with the optimization trajectories in the left shows that interaction with 
saddles retards convergence to the optimum. Examination of the trajectory of the Hessian eigenvalues 
during the search confirms interaction with a saddle manifold. The Hessian eigenvalues of the search 
interacting with the J=4 saddle are shown versus J in Figure [2 At J— 4 (dotted vertical line) , there 
are nine positive eigenvalues (marked by circles) and one negative eigenvalue (marked by square), in 
agreement with the Hessian spectrum derived in Eq. (1141) . Furthermore, at the optimum J=0, there are 
16 positive Hessian eigenvalues (marked by small circles), in agreement with the maximally allowed N 2 
positive eigenvalues. The remaining Hessian eigenvalues are null, as predicted [T{?1 |2"5] . 

Table U presents statistics on optimizations using a variety of conditions. 1000 searches starting from 
different initial fields were used to generate the statistics. Shown is the required search effort (defined 
as the number of algorithmic iterations to reach J < 10~ 6 x J max ) as well as the fraction of searches 
that interact with saddles. Three degrees of saddle interaction are examined: S < 0.1, S < 0.05, and 
S < 0.01. The probability of saddle interaction decreases with rising Hilbert space dimension N, such 
that by N=8, negligibly few searches have strong interactions with saddles. The decrease in saddle 
interactions as N rises is favorable to performing large-scale unitary transformation optimizations. 

6.B Scaling of effort with N 

Simulations were performed for N=2, 4, 8, 16, and 32 with a statistical sample size of 20 (with the 
exception for some cases of ./V=32, where a single optimization was performed) and a convergence criterion 
of J < 10~ 3 x J max - The mean search effort with statistical error is shown in Table ITT1 for all optimizations. 
The effort is plotted versus N for D=1.0, 0.9, sparse, and tensor product structures /z with rotor Hq in 
Figure [31 The observed scaling of effort for a fixed fj, structure is similar for both the rotor (Eq. (|2"T|) ) 
and oscillator (Eq. (p^g)) ) H structures, as seen in Table HIl The search effort scaling was found to be 
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strongly dependent on the control Hamiltonian structure. For D=1.0 or sparse /j,, i.e., class that allows 
distant transitions between states, the effort scales slowly with N. In contrast, for the D <1.0 and tensor 
product structure (Eq. (|30l) ). i.e., the class that forbids or has weak distant transitions, the effort scales 
exponentially with N, as shown by the least-square fit lines on the semi-log plot for 13=0.9 and tensor 
product \x structures in Figured! 

The fluence of the initial field has some effect on the absolute search effort, but not on its scaling 
with N (columns 4 and 5 of Table [TTJ) . Increasing the fluence cannot overcome exponential scaling for the 
class of control Hamiltonians that forbids distant transitions. The effort can be reduced to some extent 
by allowing T to scale with N (Table [TTJ) , in agreement with the conclusion that longer control times are 
needed for systems that have few accessible control pathways [33J , but the effort still scales exponentially 
with N. The choice of W (i.e. random unitary or FT gate) does not greatly affect the search effort, as 
shown by comparing the effort to find the FT gate or a random W using the sparse /i structure. 

6.C Control Hamiltonian Structure and Search Effort 

Of all the search parameters explored above, only the control Hamiltonian structure has a systematic 
effect on the scaling of the search effort with N. In order to determine the effects of the structure of \i for 
fixed N (here N=8), we compared control Hamiltonians with D=1.0, 0.9, 0.75, and 0.6, randomly gener- 
ated sparse structures with 14, 10, or 8 allowed transitions, and banded control Hamiltonian structures 
where 2, 3, or 4 rows nearest to the diagonal contain allowed transitions. The resulting search effort is 
plotted versus the norm ||ju|| in Figure|4j which clearly shows that does not determine search effort. 
Rather, the distribution of strong couplings between states is important. 

For a given value of ||/i||, the banded structure has the greatest search effort, followed by the D 
structure, and the sparse structure has the smallest search effort. For approximately the same value of 
~ 5, the search effort varies by over a factor of 10, from 70 iterations for the sparse structure (14 
transitions), through 100 iterations for 15=0.75 structure, to 1200 iterations for the banded structure with 
2 rows of allowed transitions (13 transitions). This indicates that systematic exclusion or suppression of 
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transitions between distant states while allowing only transitions between near states raises the search 
effort, compared to having an equal number of allowed transitions that include some distant transitions. 
Thus, the control Hamiltonian class that allows distant transitions is expected to have a lower search 
effort than the class that forbids distant transitions. This observation is consistent with the observed 
exponential scaling of the search effort with N for D <1 and tensor product structures. The reasons 
behind the dependence of the search effort on the control Hamiltonian structure will be explored in 
Section [3 

7 Search Effort and Landscape Geometry 

Here, we assess the local landscape features in terms of the metrics in Section @] for unitary propagator 
control. We first consider the structure of the landscape in terms of the local metrics in Section 17. Al 
The effects of the landscape structure on the search trajectories, as defined by the directness metric R £ 
and the Gramian matrix, are examined in Section 17. Bl 

7. A Local Landscape Structure 

The bound on the slope metric Q derived in Section 2] was found to be conservative. The recorded 
maximal slope metric Q was always significantly below this bound and observed to grow linearly with 
N, while the bound grows quadratically with N (not shown). The analytically derived Hessian trace at 
J=0 (c.f. Eq. 127)1) was found to hold; the deviation at the optimum was always less than 0.001% when 
the convergence criterion J < le — 6 J m ax was used. The Hessian trace does not predict search effort 
regardless of where it is measured, since it is only dependent on \\ji\\ or \\n 2 \\, and the effort can vary 
widely for different /i structures with similar values of \\/j,\\ (c.f., FigureHJ. 

The slope metric Q and the local curvature C were found to correlate with search effort. The statistical 
distribution of the maximal slope metric Q m ax over the search samples is plotted versus N in Figure 0(a) . 
For D=1.0, the maximal slope rises linearly with N, but the growth with N is slower for D <1.0 and 
tensor product control Hamiltonians. Since the maximal slope metric for any search is often recorded at 
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or near the initial field (depending on the exact choice of e(0, £)), an estimate of search effort scaling with 
N for any /z structure can be made simply by measuring the gradient at random initial fields for systems 
of different N with the same type of control Hamiltonian structure: a linearly increasing Q m ax with N 
indicates minimal scaling with N, while sub-linear increase of Qmax indicates exponential scaling with 
N. A more accurate prediction of the search effort can be made by measuring the Q near, but not at, 
the optimum. Figure [5] shows the absolute search effort for searches using different D structures plotted 
versus the measured value of Q at J=2 and J=0.01. Even at J=2, the gradient provides a good estimate 
of the absolute effort. The local curvature C at the optimum is also an indicator of search effort scaling, as 
shown in Figure [S^b). For control Hamiltonians that allow distant transitions and have sub-exponential 
scaling of effort with N (D=1.0 and sparse structures), the curvature is flat as N increases from 4 to 16. 
For control Hamiltonians that forbid distant transitions and have exponential scaling (D <1.0 and the 
tensor product structures), the curvature decreases with N according to a power law (note the log- log 
plot). Under all circumstances, a smaller value of C near the optimum indicates a greater search effort. 

To understand the effects of Hessian local curvature on convergence efficiency, note that near a stable 
fixed point e(t) of J on the global optimum submanifold, (i.e., for \\s(t) — e(t)|| < e), the objective 
function is approximately quadratic in e(t), and we can linearize the differential equation (|15[) around 
e(t) as 

^--J? H..^*) ^-™*- (34) 

To facilitate the convergence analysis, we assume that J(e) — > J(e) + X L(e(t)) dt. In the limit 
A — » 0, the cost functional is Mayer. For sufficiently small nonzero A, i.e., A << 1, the cost functional 
is Bolza and the optimal control problem has a unique solution s(t). Then, the Hessian se(t)Se(f) IS 
positive definite and by the Hartman-Grobman theorem for hyperbolic fixed points [34] . locally near the 
optimum, e(s,t) converges exponentially to e(t) at a rate that is bounded by the smallest eigenvalue of 
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the Hessian of the linearized system: 



\s(s,t) - e(t)\\ 



T 



exp 



o 



5 2 J 



Se(s,t)Se(s,t') 
< exp[-A min s] | |er(0, *) - e(t)\\ 



(e(0,t') - e{t')) dt' 



Since this eigenvalue \ m i n increases with local landscape curvature, higher curvature near the optimum 
facilitates convergence, for both the perturbed and original optimization problems. 

7.B Complexity of Search Trajectories 

The ratio R £ measures the degree to which the search trajectory deviates from a direct path between 
the initial and final control fields. A statistical examination of R £ for the optimizations performed in 
Section IB . B I shows that an increase in search effort with N correlates with an increase in R e with TV (not 
shown). Nevertheless, the ratio is always small (R £ < 3), indicating that while a linear trajectory from 
initial to final field cannot be followed, the trajectories followed are relatively direct. 

Further insight into the effect of the search trajectory on the required effort can be gained by exam- 
ining the evolution of R e over the search trajectory (i.e., with respect to the value of J). Three cases 
at N=S starting from the same initial field (fluence f—l with 10 evenly spaced frequency components) 
illustrate the difference in the complexity of the search trajectories. With the rotor Hq structure, the 
control Hamiltonians used are (a) fully coupled (called "flat" here), (b) sparse with 50% allowed transi- 
tions, and (c) banded with two off-diagonal bands. The trajectories of the ratio R £ are plotted in Figure 
UJ These results show that the optimization with the flat structure takes a direct path from initial to 
final field, while the optimizations with sparse and banded structures must change direction to optimize 
below J—l. In the vicinity of the optimum below J=0.1, the sparse /i optimization again can follow a 
direct path, while the trajectory for the banded /i continues to change direction. 

Examination of the Fourier spectra of the fields along the search trajectories reveals how field modes 
required for propagator control are progressively generated by local optimization. The spectra of the 
initial field, field at J=l, and optimal field for the three searches above are shown in Figure [51 At 
J—l, the fields have higher fluence and enhanced specific frequencies, particularly with low-frequency 
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components for the banded structure since only near transitions are allowed. For the banded structure, 
many new frequencies are added or greatly enhanced when going from J=l to the optimum. In contrast, 
all necessary frequencies are present at J=l for searches with the sparse and flat /x structures. 

The origin of the more complicated search trajectories and optimal fields for the banded fi struc- 
ture can be explained by examining the condition number of the Gramian matrix ^ along the search 
trajectory. Consistent with the analysis in Section T4.D1 motion in certain directions on U(N) - such as 
those necessary to reach W - is achieved more slowly for poorly conditioned Gramian matrices than for 
well-conditioned Gramian matrices. The trajectories of the Gramian matrix condition number for the 
three searches above are shown in Figure OJa). For comparison, the trajectories using an initial field 
of /=10 with the same frequencies is shown in Figure [Hfb). The condition number remains orders of 
magnitude higher for the banded fi than for the sparse and flat /i throughout the search. Furthermore, 
the condition number levels off at a value under 100 for the sparse and flat fi structures, but remains 
well above 1000 for the banded structure, and displays more oscillations at J < 1. 

As discussed in Section [4. Dl the Gramian matrix is typically more well-conditioned for controls e(t) 
where all the terms in the Dyson series required for full controllability have become populated. Compared 
to the sparse /i, the banded fi requires higher-order terms in the Dyson series to produce the desired 
W, and hence more frequency components in the optimal field. The flat /i structure requires the fewest 
Dyson terms, and thus shows the smallest difference between the fields at J=l and the optimum. For a 
given distance ||e(s,i) — s(t)\ \ from the global optimum, the accuracy of the linear approximation (|34|) is 
greater for control systems with lower dynamical Lie algebra depth, due to lower order nonlinearity of the 
optimization problem. Away from critical points (outside the quadratic region), all terms in the Dyson 
series (|26p required to reach W must be optimized, by the successive addition of new linear combinations 
of the real and imaginary components of (i\[i(t)\j) at each step. 
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8 Conclusion 

We have provided a unifying picture of the convergence efficiency of first-order algorithms for unitary 
transformation control in terms of critical landscape topology and gradient flow dynamics. The roles 
of kinematic and system-dependent factors have been assessed. The results show that understanding 
the landscape topology is insufficient for predicting the required search effort to find an optimal field. 
Thus, in this work we have defined local landscape structure metrics based on a series expansion of the 
cost function variation with respect to the control, and have demonstrated that the first-order gradient- 
based metrics can qualitatively predict the required search effort. A central conclusion is that for control 
systems with low dynamical Lie algebra depth, the convergence efficiency is kinematically driven and 
any of the common first-order control optimization approaches based on unconstrained fields and a 
Mayer-type cost functional can be effective. In these cases, local gradient-based search algorithms can 
efficiently navigate the landscape for control of arbitrary unitary transformations, assuming that the 
system is controllable. In contrast, first-order algorithms are inefficient for systems of high Lie algebra 
depth. Future work should be aimed at quantitative classification of common gate control systems in 
terms of Lie algebra depth and identification of alternate search methods for systems of high Lie algebra 
depth. 

The numerical results demonstrate that the control Hamiltonian structure determines the scaling 
of the required search effort with the Hilbcrt space dimension N for optimization of arbitrary unitary 
transformations. In particular, control Hamiltonians [i that permit transitions between distant quantum 
states (e.g., the D=l and sparse structures studied here) exhibit weak scaling of effort with N. For 
these systems, the first-order landscape structure metrics either grow linearly with TV (maximal gradient 
norm) or are invariant to ./V (gradient norm near optimum), and second-order landscape structure metrics 
are invariant to N. The gradient flow was shown to be kinematically driven based on the Gramian 
matrix being well-conditioned throughout the search trajectory. Such systems have a low dynamical 
Lie algebra depth and are amenable to efficient first-order control optimization. In contrast, systems 
where transitions between distant states are weak or forbidden (e.g., D <I.O, banded, and tensor product 
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structures) require an exponentially increasing search effort with N. The landscape structure metrics 
exhibit corresponding behavior, with the maximal gradient norm scaling sub-linearly with N and the 
gradient norm near the optimum and second order metrics decreasing exponentially with N. Such 
systems have a greater dynamical Lie algebra depth, requiring higher-order terms in the Dyson expansion 
and exhibiting less well-conditioned Gramian matrices. Optimizations with these systems deviate from 
expected kinematic behavior, indicating that dynamical effects drive the gradient flow. 

The results here suggest that for optimally controlling quantum gates, it is necessary to consider 
features of the control system other than controllability when engineering the time-independent Hamil- 
tonian. In particular, the observation of exponential scaling with Hilbert space dimension for the D and 
tensor product structures suggests that Lie algebra depth of Hq and /i should be considered, with the 
engineering goal being to make transitions between distant quantum states allowed. Such Hamiltonian 
design efforts may be facilitated by the methods of Hamiltonian morphing [35] , which allow the control 
Hamiltonian to be continuously deformed while holding the gate fidelity and the control field fixed. 

Although the control of arbitrary unitary propagators is of fundamental importance, the primary 
focus of OCT studies of propagator control is for specific applications to quantum information sci- 
ences. Systems for which the landscape search complexity and resource scaling are favorable may be 
particularly useful for directly implementing multiqubit operations rather than decomposing them into 
sequences of one- and two-qubit universal gates. Search complexity for optimal control of multiqubit 
gates may thus be mitigated by choosing quantum information processing implementations where the 
control Hamiltonian can be tuned by design, with the goal being to produce control Hamiltonians that 
allow transitions between distant states. An example is quantum computation with polar molecule arrays 
in a magneto-optical trap |36j , where photoassociation techniques can be used to assemble novel atomic 
(e.g., homonuclear and heteronuclear alkali metal) dimers with differing permanent dipole moments. In 
such implementations the static electric field gradient that renders the molecules individually address- 
able can be used to orient the molecules so that the dipole-dipole coupling can be tuned, and qubits 
can be encoded on either ground or excited rovibrational states. Investigation of the scope of possible 
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multiqubit control Hamiltonian structures accessible using such methods, and the application of OCT 
to these systems, is motivated by the present work. 
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A Stable fixed point topology of PMP-iterative propagator con- 
trol algorithms 

The first-order algorithms below can only be formulated in discrete algorithmic time |37| . A basic 
PMP-iterative algorithm proceeds via the following steps: 

i^Mt) = {H -fi- e k (t))4> k (t), 4>k(T) = VF(t^ fc _i(T)) 

i^U k (t) = (Hq — (j, ■ e k (t))U k (t), U k (0) = U (0) 
at 

where the costate equation is propagated backwards in time, with 

i k (t) = a k e k -i(t) + /3 k (MtMU k -i(t)) = a k e k -i(t) - (3 k Tr ^ul{T)VF{U k ^{T))ul{t)^U k ^{t) 
e k {t) = a k e k {t) + p k {<t> k {t)\fi\U k {t)) (35) 

The constants, a k £ [0,1] and f3 k < (for minimization of J), can in principle be chosen to be 
different in the s k (t),e k (t) updates [37]. The assignments of the constants a,/3 determine which type of 
cost functional J is optimized by the algorithm. In particular, the following values of a are of interest^ : 

1. a k = 0, /3 k < minimizes Bolza cost J = F(U(T)) + \ / Q T ' e 2 (t) dt 



5 Other choices for a k ,0 k ~ or modifications to equation (1356 - can be used to optimize either Bolza or Mayer costs, 
and are often required to ensure monotonic convergence of the algorithm, as discussed in |37| . In particular, if a k ,/3 k are 
selected outside of the intervals above, their values may not be independent. 
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2. a k = 1, A < minimizes Mayer cost 

Iterative algorithms of type 2 have been applied [38l[5j[6] to the problem of optimal gate control, [f . 

Prior work has demonstrated that PMP-iterative algorithms for quantum control converge monoton- 
ically (i.e., 6J k ,k-i < at each step). However, neither convergence to a global versus local optimum, 
nor the rate of convergence were studied. The fixed points of type 2 discrete time PMP-iterative algo- 
rithms are points on the control landscape where e k — e k — or e k — £fc-i = ((4>k(t)\ii\Uk{t)) = or 
(<fik(t)\ii\Uk-i(t)) — 0). In order for all such points to lie on the critical manifolds identified in Section 
2.2, we must require that the Gramian matrices 



U k (T)u£(t)ijU k -. 1 (t) v T ul_ x (t)nU k (t)ul{T) dt, / v[U k {T)ii k {t)]v T n k {t)Ul(T) 



T 



dt 



are nonsingular at successive steps of the algorithm. Then, the only fixed points of the discrete time 
dynamical system correspond to points U k where VuF(U(T)) = 0. Thus, if the HQF in Section T2.BI 
is positive definite at U k (T) — U, then lim U k (T) — U, for some e that is equal to the radius of the 
attracting region of the critical point U, As shown in Section [3j the only critical point of F(U) that 
satisfies this criterion for asymptotic convergence is U = W and the associated neutrally stable controls 
s(t) lie on the global minimum submanifold of J(e(-)). 



B Hessian quadratic form and rank 

The explicit form of the matrix A 2 in the HQF expression (fTTj) can be obtained from the second variation 
in the Taylor expansion of J(e + Se); we find 

A 2 = f [ Se(t) ■ fi(t)fi(t') ■ Se(t r ) dt dt'. 
Jo Jo 

Assuming that the Gramian ^ is nonsingular, i.e., that the real and imaginary components of the ele- 
ments of are linearly independent functions of time, A can be any Hermitian matrix with associated 

6 Unlike homotopy algorithms, neither type of iterative algorithm introduced thus far can minimize field fluence while 
reaching high gate fidelity. 
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direction U(T)A in the tangent space TijU(N) to the unitary group L. 

The range of % is spanned by eigenfunctions of the Hessian kernel: these eigenfunctions /„ (t) , which 
satisfy ^ %{t,t')f u {t')dt' — \ v f v (t), are linear combinations of products of the real and imaginary 
components $t{i\(j.(t)\j) , ^s(i\fi(t)\j) of the time-evolved dipole operator, i.e., 

n(t,t>) = f ^iMMm+^mm ) \ Y, a ^( k wm+a' k ^(kHt')\i) ) , (36) 
\j>i j \i>k j 



where the expansion coefficients can be computed from equation (jTSJ) . This immediately implies that 
H(t, t') is a finite rank kernel, with rank W(t, t') < N 2 , even away from critical points where the HQF in 
Section 2 cannot be used to assess rank. 



C Numerical details 

The control field e(t) was discretized on a time interval t € [0, T] in arbitrary dimensionless units into 
a sufficient number of time points to resolve the |1) — > \N) transition frequency in Hq. For the rotor 
Hamiltonian $F7$ with T=14, 512 points were used for N < 8, 2048 points for N=16, and 4096 points 
for N=32. For the oscillator Hamiltonian (|28D . 512 points were used for N < 8, 1024 points for iV=16, 
and 2048 points for iV=32 were used. When T > 14, 4096 points were used. For the simulations in 
Sections [51 the initial control field e(0,t) contained K=20 Fourier components randomly chosen from a 
uniform distribution on an interval [0,wijv]j where ujin denotes the |1) — > |JV) transition frequency. 

Reported search effort is the number of RK4 algorithm iterations required to attain a J value below 
the convergence criterion. This was J < 10~ 6 x J max for all simulations except those in Section f6.BL 
where the criterion was J < 10~ 3 x J ma x- The sample size to generate the reported statistics was 1000 
for the simulations in Section 16. Al and 20 for the simulations in Sections 16. Bl I6.C1 and [7J In Tables 1 
and 2, the mean effort and standard deviation are reported. For the particularly difficult optimizations 
shown in Table 2 reporting no standard deviation, only one search was performed. In Figures [3] and [H 

7 Note the third term in equation (1186 does not contribute to the second order variation and hence does not have a 
corresponding term in the HQF; this is consistent with the fact that the second order variation is a quadratic form only at 
critical points, and cannot be used to assess the definiteness of the Hessian away from such points. 
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the error bars report the left and right standard deviations. 

The metrics in Section U were calculated by approximating the integrals as sums over the discretized 
time-points. 
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0.109 (63.3) 


0.016 (66.3) 






Table I: Search effort mean/standard deviation and probability of encountering saddles. For the column 
labeled W, In denotes the iV-dimensional identity matrix and FT denotes the quantum Fourier transform 
gate. / denotes the initial field strength. Effort is the number of iterations required to achieve J < 
10~ 6 x J m ax- The last three columns show the fraction of searches to encounter saddle manifolds to 
within S values of 0.1, 0.05, and 0.01; the numbers in parentheses denote the mean search effort for 
searches that encountered these saddles. 
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Figure 1: Sample search trajectories for N=4. Left panel: J value versus iteration. The solid-line 
trajectory goes directly from the initial J value to J—Q, while the dashed and dotted trajectories slow 
down at J=4 and J=8, respectively, suggesting interaction with the corresponding saddles. Right panel: 
Saddle metric versus J for the same searches. 
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Table II: Scaling of search effort with TV for different choices of Ho, H structure, initial field strength /, 
time T, and target W. 
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J value 

Figure 2: Hessian eigenvalues versus J value for the dashed-line trajectory of Figure [TJ This search 
encountered the J— A saddle, shown by the nine positive Hessian eigenvalues (labeled by large circles) 
and one negative eigenvalue (labeled by a square). At the optimum, there are 16 positive Hessian 
eigenvalues (small circles). All values are unitless. 
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N 

Figure 3: Mean search effort (algorithmic iterations) with left and right standard deviation versus N for 
rotor Hq with /i structures D=1.0 (squares), £>=0.9 (circles), sparse (up triangles), and tensor product 
(right triangles). 
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Figure 4: Mean search effort versus norm \\fi\ \ for N=8 with control Hamiltonian structures D (c.f. Eq. 
[23 squares), sparse (triangles), and banded (circles). 




Figure 5: (a) Mean value of the maximal slope metric Gmax versus N for D=1.0 (squares), tensor 
product (circles), D=0.6 (up triangles), and sparse n structure (side triangles), (b) Mean curvature C at 
the optimum versus N. The rotor Hq structure is used for all searches. All values are unitless. 
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gradient norm 

Figure 6: Search effort versus slope metric measured at J=0.01 and ,7=2. The effort scales with slope 
metric according to a power law, as seen by the least squares lines on the log- log plot. All values are 
unitless. 
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Figure 7: Ratio R £ as a function of J value for searches beginning from the same initial held with flat, 
sparse, and banded /i structures. The inset shows the path length for these searches as a function of J 
value. 
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Figure 8: Fourier spectra of initial field (dotted line), field at ,7=1 (thick line), and optimal field (thin 
line) for banded structure (left), sparse structure (bottom right), and flat structure (top right). The 
ordinate is of the same scale for all plots. All values are unitless. 
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Figure 9: Gramian matrix condition number versus J value for searches beginning from the same initial 
field at /=! (left) and /=10 (right) for the flat, banded, and sparse fj, structures. 
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